The main objective of this systematic literature survey was to identify gaps and trends in a representative sample of published studies using deep-learning algorithms to analyse image data for animal species/individual/behaviour recognition and/or classification. To gain insights into recent developments presented in the academic literature, we focused on journal articles and full-text conference proceedings published in the last 5 years (2017-2021).
We ran a search in Scopus on 2021/10/10 using a pre-piloted search string (for details on its development, including the validation set, refer to a dedicated Notion notebook - ADD DETAILS LATER?):
( TITLE-ABS-KEY ( ( *automatic* OR “machine learning” OR “computer learning” OR “deep learning” OR “neural network*” OR “random forest*” OR “convolutional neural” OR “convolutional network*” OR “learning algorithm*” OR “Support Vector*” ) AND ( image* OR camera* OR video* OR vision ) AND ( *wild* OR population* OR “species identif*” OR “species label*” OR “species richness” OR ( bahavio* AND within/ 10 classif* ) OR ( bahavio* AND within/ 10 recogn* ) ) AND NOT ( “natural language” OR “sign language” OR accelomet* OR clinical* OR industr* OR agricult* OR farm* OR leaf OR husbandry OR food* OR tissue* OR cell* OR cultur* OR wildfire* OR “tree growth” OR forestry OR hydrolog* OR engineer* OR “oxygen species” OR molec* OR bacteria* OR microb* OR chemi* OR spectrom* OR brain* OR drug* OR patient* OR cancer* OR smoking OR disease OR diabet* OR landsat* OR sentinel OR satellite* OR “land cover” OR “land use” OR “vegetation map*” OR galax* OR “Google Earth” OR scan* OR “X-ray” OR “health care” OR participant* OR emotion* OR employee* OR speech OR proceedings ) ) ) AND PUBYEAR > 2016
The search retrieved 2,259 bibliographic records that were then downloaded and screened for inclusion.
Following the PICO framework, we included articles if all of the criteria below were fulfilled:
Population: wild or semi-wild vertebrate species (exclude domestic or farmed animals, invertebrates, museum specimens).
Intervention / Innovation: use of computer vision machine learning algorithms (including neural-network type methods such as deep learning and CNN, support vector machines, and random forest) for automated or semi-automated processing of image data (e.g. from camera traps, video tracking, thermal imaging) at a scale where individual animals are visible (including aerial and drone images; excluding images gathered from satellites, biologging, X-ray, MRI or equivalent).
Comparator / Context: images taken in the wild or semi-wild (includes zoo enclosures, excludes lab-based or agricultural/aquaculture/pet studies).
Outcomes: analyses focus on animal / species individual recognition/classification or animal behaviour recognition/classification.
Additional criteria: studies published in the last 5 years (2017-2021), peer-reviewed (including full-text conference proceedings).
We used Rayyan QCRI software to screen the 2,259 unique bibliographic records downloaded from Scopus. Two researchers (ML, JT) independently performed the screening, assessing the titles, abstracts, and keywords of each article. This screening resulted in 225 articles included for full-text assessment and data extraction.
Out of the 225 papers included, we obtained the full text for 215 papers.
For data extraction, we used a two-part custom questionnaire implemented as a Google Form (https://forms.gle/N7Hn9DVRjjmoKRd58). To pilot the form, we randomly selected 14 papers for independent screening and extraction by three researchers (ML, JT, RF). We resolved disagreements by discussion until consensus was reached, and we refined the questionnaire before the main round of full-text screening and data extraction.
One researcher (ML) performed full-text screening and data extraction for the remaining 195 papers. A second researcher (RF) cross-checked 58 of these papers for accuracy and to resolve cases where the information provided in the papers was unclear. After the full-text assessment, we extracted data from 192 studies.
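The record counts reported above can be tallied as a simple screening flow. The following Python sketch is illustrative only and is not the code used for this survey; the stage labels are shorthand introduced here, and only the counts come from the text.

```python
# Screening flow: (stage label, number of records remaining).
# Counts are taken from the text; the labels are illustrative shorthand.
stages = [
    ("retrieved from Scopus", 2259),
    ("included after title/abstract/keyword screening", 225),
    ("full text obtained", 215),
    ("included for data extraction", 192),
]


def screening_flow(stages):
    """Return (label, remaining, dropped_since_previous_stage) for each stage."""
    flow = []
    previous = None
    for label, remaining in stages:
        dropped = 0 if previous is None else previous - remaining
        flow.append((label, remaining, dropped))
        previous = remaining
    return flow


for label, remaining, dropped in screening_flow(stages):
    print(f"{label}: {remaining} (dropped {dropped})")
```

Listing the number dropped at each stage makes it easy to check that the counts reported in different parts of the text agree with each other.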
| Question | Answer options |
|---|---|
| Paper’s title: | [text] |
| First author’s family name: | [text] |
| Publication year: | [number] |
| Journal name: | [text] |
| Article doi: | [text] |
| C1. Peer-reviewed empirical study | [yes; no; unsure/other] |
| Comment for C1 | |
| C2. Is full text available in English? | [yes; no; unsure/other] |
| Comment for C2 | |
| C3. Population: wild or semi-wild vertebrate species? | [yes; no; unsure/other] |
| Comment for C3 | |
| C4. Intervention / Innovation: use of computer vision machine learning algorithms (for automated or semi-automated processing of image data at a scale where individual animals are visible)?: | [yes; no; unsure/other] |
| Comment for C4 | |
| C5. Comparator / Context: are the studied animals in the wild or semi-wild? | [yes; no; unsure/other] |
| Comment for C5 | |
| C6. Outcomes: focus on animal / species individual recognition / classification or animal behaviour recognition / classification ?: | [yes; no; unsure/other] |
| Comment for C6 | |
| Q1. Number of studied species | [number] |
| Comment for Q1 | |
| Q2. Study species (Latin name) | [text] |
| Comment for Q2 | |
| Q3. Studied species group: | [mammals; birds; reptiles; amphibians; fishes; other/unclear]* |
| Comment for Q3 | |
| Q4. Used image type source: | [camera trap or surveillance camera (fixed); aerial (including drone); hand camera (or mobile phone camera); other/unclear]* |
| Comment for Q4 | |
| Q5. Study context or setting: | [wild; semi-wild; unclear/other]* |
| Comment for Q5 | |
| Q6. Location country/region: | [text] |
| Q7. Location details: | [text] |
| Q8. Algorithm type: | [Neural Network; Random forest; Gradient boosting model; Support Vector Machines; Rule-based learners; Decision trees; K-Nearest Neighbour; unclear/other]* |
| Q9. Outcome type: | [counting individuals (at given time); individual recognition (re-identification); species recognition/classification (class/object detection); behaviour detection (at given time); tracking (following through space); behaviour classification (changes over time); unclear/other]* |
| Q10. Analysis code | [yes; no; unclear/other] |
Note: * indicates plural variables (i.e. more than one answer option can be chosen).
Each question in the data extraction form (Table S1) is followed by a dedicated comment field used to record any additional details, including relevant quotes from the paper. We excluded any papers that were coded as “no” for any of questions C1 to C6 (the full-text screening questions, which assess whether the paper fulfils our inclusion criteria), i.e. these papers were not subject to any further data extraction and analyses.
After data extraction, additional columns were added to the data table with the following data:
- Q7_coordinates: latitude and longitude of the study location, as reported in the paper or, if not reported, taken from Google Maps
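The exclusion rule for the screening questions can be expressed as a small filter. This is a minimal Python sketch under stated assumptions, not the code used for this survey: the record identifiers and answers are hypothetical, the keys `C1` to `C6` are assumed column names, and "unsure/other" answers are kept because the text only specifies exclusion for "no".

```python
SCREENING_QUESTIONS = ("C1", "C2", "C3", "C4", "C5", "C6")


def is_excluded(record):
    """Return True if any screening question C1 to C6 was answered 'no'."""
    return any(
        record.get(question, "").strip().lower() == "no"
        for question in SCREENING_QUESTIONS
    )


def all_yes():
    """Helper for the example: a record answering 'yes' to every question."""
    return {question: "yes" for question in SCREENING_QUESTIONS}


# Hypothetical coded records (identifiers and answers are made up).
records = [
    dict(all_yes(), id="paper_a"),
    dict(all_yes(), id="paper_b", C3="no"),
    dict(all_yes(), id="paper_c", C5="unsure/other"),
]

included = [record["id"] for record in records if not is_excluded(record)]
print(included)
```

In this sketch only `paper_b` is dropped, because it is the only record with a "no" answer.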
- Q7_location_unclear: 0 = clear (location at least at the level of national park, state, province, city, or equivalent - reported in the article or inferred from the data set name); 1 = unclear, location either not reported or cannot be assigned to a specific location (e.g., global data, broad regions such as Arctic, Northern Atlantic, Africa, America)
- Checked: whether record was cross-checked by an independent researcher
- Checking_comments: any comments from data extraction checking
- Changed: whether record was changed after cross-checking
- Changed_comment: how record was changed after cross-checking
- Pilot: whether study was used in the piloting phase
- Included: whether study was included in the final data set for extraction
- Exclusion reason: main reason for excluding study from the final data set for extraction, if excluded
Out of the 215 full-text articles screened, 192 were deemed eligible for data extraction (Table S2). The data extraction spreadsheet is stored as mapping_dataset_reconciled.xlsx. Below, we present a summary of the extracted data.
List of articles excluded at full-text screening, with main reasons for exclusion.
List of included articles with key bibliographic information.
Data cleaning before generating summaries and plotting.
A barplot of the counts of publications in different journals, with the top 20 shown in descending order of frequency.
To infer discipline / audience type, we categorized publication journals as: computer science / technology, ecology, multidisciplinary.
Most data sets have a prespecified number of animal species / classes present. A class can represent a species or a higher taxonomic group, such as genus, family, order, super-order, etc. (even “animals” can be a class). Classes of non-animal objects (e.g. humans, vehicles) were not counted. When more than one dataset was used, the number was extracted for the biggest dataset.
As a histogram with numbers of classes on a log x-scale due to strong right-skew in the data.
As a barplot displaying actual values of the numbers of species / classes.
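To illustrate why a log scale suits strongly right-skewed counts, the Python sketch below groups class counts into bins that double in width (1, 2-3, 4-7, 8-15, ...), which are equal-width on a log2 axis. The values are made up for illustration and are not the extracted data, and the plots in this document may use a different binning.

```python
from collections import Counter


def log2_bin(n):
    """Index k of the bin [2**k, 2**(k+1)) that contains the positive integer n."""
    if n < 1:
        raise ValueError("class counts must be positive integers")
    return n.bit_length() - 1


def log2_histogram(values):
    """Map (lower, upper_inclusive) bin bounds to the number of values in the bin."""
    counts = Counter(log2_bin(value) for value in values)
    return {(2**k, 2 ** (k + 1) - 1): counts[k] for k in sorted(counts)}


# Hypothetical numbers of classes per study (not the real data).
example = [1, 1, 1, 2, 3, 5, 8, 20, 48, 1000]
for (low, high), count in log2_histogram(example).items():
    print(f"{low}-{high}: {count}")
```

With doubling bins, the many small values stay distinguishable from each other, while a rare very large value still fits on the same axis.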
For studies focusing on a single animal species, we extracted the species name to investigate which particular species were most popular (subspecies names are omitted from the plot labels).
Most popular types of animals as represented by commonly used “biological” categories. One study could be coded as studying one or more categories of animals, e.g. both mammals and birds. However, the distribution of the number of species within multi-category studies was often uneven, e.g. a commonly used Serengeti dataset from Tanzania is dominated by large mammals, with only a few species of large birds considered in analyses.
Image sources were categorized by the type of hardware used to collect image data: fixed surveillance/trap cameras (often activated by movement, or continuously recording), hand-held devices including mobile phones, or devices mounted on aerial vehicles including drones. Where the source was not clearly reported in a paper, we inferred it from the example images from the analysed dataset or from descriptions of the dataset in other publications. A single study could be coded as using one or more categories of image sources, e.g. a mix of camera traps and hand-held cameras.
Settings of the images used were classified as wild or semi-wild (outdoor enclosures for wild animals). A single study could be coded as using one or more categories of settings, e.g. a mix of images from the wild and of captive animals.
Country or larger region where animal images were collected. A single study could be coded as using images from one or more countries/regions. Some studies using images of captive animals kept in zoos, likely across multiple countries, were coded as “global” (often images sourced from the Internet/social platforms).
A barplot of the counts of articles originating from a given country / larger region. “Global” are usually datasets based on images collected from the Internet or social media.
A choropleth map of the counts of articles based on a dataset originating from a given country. Data gathered from regions larger than a country (e.g. oceans, continents, global) are not shown.
Location coordinates represent either a specific location (green circles) or the centroid of a broader region (orange circles) from which animal images originated. Darker circles indicate a larger number of studies using images from a given location. Global image datasets (e.g. gathered from the Internet or social media) are not shown.
Barplot of the main types of machine learning algorithms used. A single study could be coded as using one or more types.
Barplot of the main types of outcomes / purposes of analyses used. A single study could be coded as using one or more types.
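Because a single study can select several options for these variables, counts per option are tallied per selection rather than per study. The Python sketch below shows one way to do this; it assumes multi-select answers are stored as a single string with options separated by ";" (an assumption about the export format), and the example answers are hypothetical.

```python
from collections import Counter


def count_multi_select(answers, sep=";"):
    """Count how many studies selected each option (once per study per option)."""
    counts = Counter()
    for answer in answers:
        options = {part.strip() for part in answer.split(sep) if part.strip()}
        counts.update(options)
    return counts


# Hypothetical Q8 answers for four studies (not the real data).
q8_answers = [
    "Neural Network",
    "Neural Network; Support Vector Machines",
    "Random forest; Neural Network",
    "Support Vector Machines",
]

q8_counts = count_multi_select(q8_answers)
for option, n in q8_counts.most_common():
    print(option, n)
```

Note that the bar heights in such a plot can sum to more than the number of studies: here four studies produce six selections.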
Barplot of the analysis code availability. Code was coded as available when a link to a code repository was provided in the article.
A set of draft plots using information from two or more extracted variables. To be refined.
A heatmap showing crosstabulation of the main types of machine learning algorithms and analysis outcomes / purposes. A single study could be coded as using one or more types for both variables.
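A cross-tabulation of two multi-select variables counts each study once for every pair of options it selected. The following Python sketch illustrates the idea under the assumption that options within an answer are separated by ";"; the rows are hypothetical and the option labels are shortened for readability.

```python
from collections import Counter
from itertools import product


def split_options(answer, sep=";"):
    """Split a multi-select answer into a sorted list of unique options."""
    return sorted({part.strip() for part in answer.split(sep) if part.strip()})


def cross_tabulate(rows, sep=";"):
    """Count studies for every (algorithm, outcome) pair selected together."""
    table = Counter()
    for algorithms, outcomes in rows:
        pairs = product(split_options(algorithms, sep), split_options(outcomes, sep))
        for pair in pairs:
            table[pair] += 1
    return table


# Hypothetical (Q8, Q9) answers for three studies (not the real data).
rows = [
    ("Neural Network", "species recognition; counting individuals"),
    ("Neural Network; Random forest", "species recognition"),
    ("Support Vector Machines", "individual recognition"),
]

table = cross_tabulate(rows)
for (algorithm, outcome), n in sorted(table.items()):
    print(f"{algorithm} x {outcome}: {n}")
```

Each cell of the heatmap then corresponds to one key of `table`, and a study with several algorithms and several outcomes contributes to several cells.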
A heatmap showing crosstabulation of the study publication year and analysis outcomes / purposes. A single study could be coded as using one or more types for the analysis outcomes / purposes.
A stacked area chart showing crosstabulation of the study publication year and analysis outcomes / purposes. A single study could be coded as using one or more types for the analysis outcomes / purposes.
A heatmap showing crosstabulation of the study publication year and the main types of machine learning algorithms. A single study could be coded as using one or more types for the types of machine learning algorithms.
A stacked area plot showing crosstabulation of the study publication year and the main types of machine learning algorithms. A single study could be coded as using one or more types for the types of machine learning algorithms.
A stacked area plot showing yearly changes in proportions of the main types of machine learning algorithms used. A single study could be coded as using one or more types for the types of machine learning algorithms.
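Yearly proportions for a multi-select variable depend on the choice of denominator. The Python sketch below divides by the total number of selections in each year, so that the shares within a year sum to 1; this denominator is an assumption made here for illustration (dividing by the number of studies per year is an equally plausible choice), and the example rows are hypothetical.

```python
from collections import Counter, defaultdict


def yearly_proportions(rows, sep=";"):
    """Return {year: {option: share of that year's selections}}."""
    per_year = defaultdict(Counter)
    for year, answer in rows:
        options = {part.strip() for part in answer.split(sep) if part.strip()}
        per_year[year].update(options)
    proportions = {}
    for year in sorted(per_year):
        total = sum(per_year[year].values())
        proportions[year] = {
            option: n / total for option, n in per_year[year].items()
        }
    return proportions


# Hypothetical (publication year, Q8 answer) pairs (not the real data).
rows = [
    (2017, "Support Vector Machines"),
    (2017, "Neural Network"),
    (2021, "Neural Network"),
    (2021, "Neural Network; Random forest"),
    (2021, "Neural Network"),
]

shares = yearly_proportions(rows)
for year, by_option in shares.items():
    print(year, by_option)
```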
A stacked area plot showing yearly changes in the counts of publications by journal discipline.
A barplot of species names for studies focusing on re-identification of individuals. Species names were extracted only from papers focusing on a single species (i.e. data not shown for 12 multi-species studies).
Phylogenetic tree of species used in studies focusing on re-identification of individuals. Species names were extracted only from papers focusing on a single species (i.e. data not shown for 12 multi-species studies). The tree was built using the rotl R package (https://peerj.com/preprints/1471/), which provides access to the synthetic phylogenetic tree available from the Open Tree of Life database (https://opentreeoflife.org/).
Overlay organism silhouettes
Note: phylopic.org hosts free silhouette images of animals, plants, and other life forms, all under Creative Commons or Public Domain. Colours are also used to indicate biological groups across the plot.
Note: Plotting to pdf causes some of the animal silhouettes to have distorted height:width ratios - these will be fixed manually in Adobe Illustrator (saved as Figure2_tree_Ai.pdf)
Table of counts of studies with unclear or missing data for key extracted variables (as applicable).
A stacked barplot of counts of studies with unclear and missing data for key extracted variables (as applicable).
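Counts of unclear and missing values per variable can be tallied as in the Python sketch below. It is illustrative only: it treats an empty or absent answer as missing and any answer containing "unclear" or "unsure" as unclear, which is an assumption about the coding rather than the rule used for this survey, and the records are hypothetical.

```python
def summarise_unclear_missing(records, variables):
    """Return {variable: {"unclear": n, "missing": n}} across all records."""
    summary = {}
    for variable in variables:
        unclear = 0
        missing = 0
        for record in records:
            value = (record.get(variable) or "").strip().lower()
            if not value:
                missing += 1
            elif "unclear" in value or "unsure" in value:
                unclear += 1
        summary[variable] = {"unclear": unclear, "missing": missing}
    return summary


# Hypothetical extracted records (not the real data).
records = [
    {"Q3": "mammals", "Q5": "wild", "Q10": "yes"},
    {"Q3": "other/unclear", "Q5": "", "Q10": "no"},
    {"Q3": "birds", "Q5": "unclear/other", "Q10": "unclear/other"},
    {"Q3": "mammals; birds", "Q10": "no"},
]

summary = summarise_unclear_missing(records, ["Q3", "Q5", "Q10"])
for variable, counts in summary.items():
    print(variable, counts)
```

Under this rule a multi-select answer that combines a clear option with an unclear one would be counted as unclear, which may or may not be desirable.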
These analyses are based on the information extracted from bibliographic records downloaded from Scopus. Initial preprocessing and summaries were done using the bibliometrix R package. Subsequently, these data were combined with manually coded data from the full texts.